
Product Recommendation after Purchase – Does Apriori Make Better Recommendations?



### Introduction

During my last internship, I used Apriori to improve the recommendation system of my company, KKday. KKday is the leading e-commerce travel platform in Asia. In this article I use sales data from KKday to illustrate the performance of, and the differences between, Apriori and a distance-based recommendation system.



### Data Preprocessing

require(dplyr)
## Loading required package: dplyr
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
require('DT')
## Loading required package: DT

The Original Data

head(data,10)
##     X prod_oid user_id
## 1   1     8332  132591
## 2   2     3598  132591
## 3   3     8332   58804
## 4   4    12808   58804
## 5   5    18073   55631
## 6   6     9987   55631
## 7   7    17772   55631
## 8   8    12049  142657
## 9   9    17756  142657
## 10 10     7505  198893

The data contains the products that users have ordered. ‘prod_oid’ is the code of each product, while ‘user_id’ identifies the user who bought it. For example, user ‘132591’ has bought products ‘8332’ and ‘3598’, so we treat them as bought together.

Combine Orders by Same User

df <- data %>%
  group_by(user_id) %>%
  summarise(prod_oid_paste = paste(prod_oid, collapse=" "),
            n = n()) %>% filter(n >1) # remove users who ordered only one product
head(df)
## # A tibble: 6 x 3
##   user_id prod_oid_paste     n
##     <int> <chr>          <int>
## 1       1 2808 11971         2
## 2       2 10999 2173         2
## 3       3 2689 2686          2
## 4       4 17696 13367        2
## 5       5 18350 2716         2
## 6       6 17975 18576        2
retail.list <-  df
# Separate product IDs by " "
retail.list <- sapply(retail.list$prod_oid_paste,strsplit, " ")
head(retail.list)
## $`2808 11971`
## [1] "2808"  "11971"
## 
## $`10999 2173`
## [1] "10999" "2173" 
## 
## $`2689 2686`
## [1] "2689" "2686"
## 
## $`17696 13367`
## [1] "17696" "13367"
## 
## $`18350 2716`
## [1] "18350" "2716" 
## 
## $`17975 18576`
## [1] "17975" "18576"

The data has to be transformed into the ‘transactions’ type required by the arules package, which we will explore later.

Therefore, we group the data by user_id and paste each user's orders together. The original data.frame has now become a list, in which each element is the market basket ordered by one customer.
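
The grouping step can also be sketched in base R, which may make the shape of the result easier to see. The toy `orders` data frame below is made up for illustration (it reuses the column names `prod_oid` and `user_id`, not the real data); `split()` builds one character vector of products per user, and single-product users are dropped as in the dplyr pipeline above.

```r
# Toy order data (hypothetical values, same column names as the real data)
orders <- data.frame(
  prod_oid = c(8332, 3598, 8332, 12808, 18073, 7505),
  user_id  = c(1, 1, 2, 2, 3, 4)
)

# One basket (character vector of product IDs) per user
baskets <- split(as.character(orders$prod_oid), orders$user_id)

# Keep only users who bought more than one product
baskets <- baskets[lengths(baskets) > 1]

print(baskets)
```

Each list element is one basket, which is the shape that gets converted to transactions in the next step.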

Transform Order Data into Transaction Data

require(arules)
## Loading required package: arules
## Loading required package: Matrix
## 
## Attaching package: 'arules'
## The following object is masked from 'package:dplyr':
## 
##     recode
## The following objects are masked from 'package:base':
## 
##     abbreviate, write
retail.trans <- as(retail.list, "transactions")
summary(retail.trans)
## transactions as itemMatrix in sparse format with
##  222845 rows (elements/itemsets/transactions) and
##  5067 columns (items) and a density of 0.000498097 
## 
## most frequent items:
##    2674    2685    7423    8332    2173 (Other) 
##   12115   11950    9853    9149    7985  511377 
## 
## element (itemset/transaction) length distribution:
## sizes
##      2      3      4      5      6      7      8      9     10     11 
## 148855  47268  16913   6111   2241    899    331    114     54     22 
##     12     13     14     15     16     17     23     26 
##     14     10      5      2      2      2      1      1 
## 
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   2.000   2.000   2.000   2.524   3.000  26.000 
## 
## includes extended item information - examples:
##   labels
## 1  10000
## 2  10005
## 3  10007
## 
## includes extended transaction information - examples:
##   transactionID
## 1    2808 11971
## 2    10999 2173
## 3     2689 2686

After converting to transaction data, the summary function shows that product ‘2674’ is the most frequent item, appearing in 12,115 customers’ baskets. The median basket size is 2: since every basket has at least two products, at least 50% of customers have exactly two, and the length distribution puts the share at about 67% (148,855 of 222,845).
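
The two figures read off the summary can be reproduced by hand on a small example. The sketch below uses a made-up list of baskets (not the KKday data) and base R only: `table()` counts how many baskets contain each product, and `lengths()` gives the basket sizes.

```r
# Hypothetical baskets, one character vector per customer
baskets <- list(
  c("2674", "2685"),
  c("2674", "8332"),
  c("2674", "2685", "7423"),
  c("2173", "8332")
)

# Number of baskets containing each product, most frequent first
item_counts <- sort(table(unlist(lapply(baskets, unique))), decreasing = TRUE)
print(item_counts)

# Basket-size distribution and its median
basket_sizes <- lengths(baskets)
print(table(basket_sizes))
print(median(basket_sizes))
```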

Applying Association Rules

Depending on the analyst’s experience and the purpose of the analysis, three measures are used in arules to extract meaningful patterns: support, confidence, and lift.

Here we set minimum thresholds on support and confidence, which is common in most research.
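
As a reminder of what these measures mean for a rule A => B over N baskets: support is the share of baskets containing both A and B, confidence is that count divided by the number of baskets containing A, and lift is confidence divided by the share of baskets containing B. A minimal base-R sketch on made-up baskets follows; the helper `rule_measures` is hypothetical and not part of arules.

```r
# Hypothetical baskets for illustration only
baskets <- list(
  c("A", "B"),
  c("A", "B", "C"),
  c("A", "C"),
  c("B", "C"),
  c("A", "B")
)

# Support, confidence and lift of the rule lhs => rhs
rule_measures <- function(baskets, lhs, rhs) {
  n       <- length(baskets)
  has_lhs <- sapply(baskets, function(b) all(lhs %in% b))
  has_rhs <- sapply(baskets, function(b) all(rhs %in% b))
  support    <- sum(has_lhs & has_rhs) / n
  confidence <- sum(has_lhs & has_rhs) / sum(has_lhs)
  lift       <- confidence / (sum(has_rhs) / n)
  c(support = support, confidence = confidence, lift = lift)
}

m <- rule_measures(baskets, lhs = "A", rhs = "B")
print(m)
```

A lift above 1 means the two sides appear together more often than independence would suggest; in this toy example the lift is slightly below 1.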

Setting Parameters to extract frequent patterns

sup = 0.0001
conf = 0.1
retail.rules <- apriori(retail.trans, parameter=list(supp=sup, conf=conf))
## Apriori
## 
## Parameter specification:
##  confidence minval smax arem  aval originalSupport maxtime support minlen
##         0.1    0.1    1 none FALSE            TRUE       5   1e-04      1
##  maxlen target   ext
##      10  rules FALSE
## 
## Algorithmic control:
##  filter tree heap memopt load sort verbose
##     0.1 TRUE TRUE  FALSE TRUE    2    TRUE
## 
## Absolute minimum support count: 22 
## 
## set item appearances ...[0 item(s)] done [0.00s].
## set transactions ...[5067 item(s), 222845 transaction(s)] done [0.08s].
## sorting and recoding items ... [1488 item(s)] done [0.01s].
## creating transaction tree ... done [0.14s].
## checking subsets of size 1 2 3 4 done [0.02s].
## writing ... [4400 rule(s)] done [0.00s].
## creating S4 object  ... done [0.04s].

Because there are thousands of products on KKday, we set low thresholds to make sure we have enough patterns for recommendation. This yields 4,400 association rules.
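
To get a feel for how low this support threshold is, it helps to convert it into a number of baskets. The sketch below uses only figures printed above (222,845 transactions and `sup = 0.0001`); the variable names `n_trans`, `threshold_count` and `min_count` are my own.

```r
n_trans <- 222845   # number of transactions reported by summary(retail.trans)
sup     <- 0.0001   # minimum support used above

# Support is count / n_trans, so the threshold expressed in baskets is
threshold_count <- sup * n_trans
print(threshold_count)            # about 22.28 baskets

# Smallest whole number of baskets whose support reaches the threshold
min_count <- ceiling(threshold_count)
print(min_count)
print(min_count / n_trans)
```

The apriori log prints 22 as the absolute minimum support count, presumably the same product rounded down; either way, a count of 22 gives a support just under 0.0001, and the smallest counts that appear in the rule listings in this article are 23.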

Visualize the support/confidence distribution with arulesViz

# install.packages("arulesViz")
library(arulesViz)
## Loading required package: grid
plot(retail.rules, engine = "plotly")
## Warning: plot: Too many rules supplied. Only plotting the best 1000 rules
## using measure lift (change parameter max if needed)
## To reduce overplotting, jitter is added! Use jitter = 0 to prevent jitter.

This interactive visualization tool can help us choose the parameters. By looking at the distribution and the number of rules, we can decide whether to raise the thresholds.

Which products are most frequently bought together

retail.conf <- head(sort(retail.rules, by="confidence"), 20)
inspect(retail.conf)
##      lhs                 rhs     support      confidence lift       count
## [1]  {12225}          => {11359} 0.0001256479 1.0000000  1714.19231   28 
## [2]  {1446,2358,2768} => {2914}  0.0001032108 0.9583333    60.44715   23 
## [3]  {1859}           => {1853}  0.0001435976 0.9411765   134.53269   32 
## [4]  {2791}           => {2143}  0.0001256479 0.9032258   426.43931   28 
## [5]  {11912}          => {9735}  0.0002468083 0.9016393   252.73688   55 
## [6]  {1446,2612,7423} => {2914}  0.0001076982 0.8888889    56.06692   24 
## [7]  {1862}           => {1853}  0.0009872333 0.8627451   123.32164  220 
## [8]  {12784,2674}     => {2685}  0.0002782203 0.8611111    16.05810   62 
## [9]  {1446,2768,7423} => {2914}  0.0002064215 0.8518519    53.73080   46 
## [10] {1446,2878,7423} => {2914}  0.0001211604 0.8437500    53.21978   27 
## [11] {1446,2930}      => {2914}  0.0001615473 0.8372093    52.80722   36 
## [12] {4239,4627,5260} => {5925}  0.0001121856 0.8333333    61.22788   25 
## [13] {1446,2768,2878} => {2914}  0.0001076982 0.8275862    52.20024   24 
## [14] {8416}           => {8427}  0.0002019341 0.8181818   302.36771   45 
## [15] {5260}           => {5925}  0.0069958940 0.8140992    59.81469 1559 
## [16] {1446,1922}      => {2914}  0.0002243712 0.8064516    50.86717   50 
## [17] {12822,2674}     => {2685}  0.0001256479 0.8000000    14.91849   28 
## [18] {1446,2612}      => {2914}  0.0003006574 0.7976190    50.31005   67 
## [19] {2479,2843}      => {2312}  0.0001750095 0.7959184    83.62396   39 
## [20] {2583,2674}      => {2685}  0.0008840225 0.7943548    14.81322  197

Sorting the rules by confidence, highest first, we can see that product ‘12225’ has a 100% chance of being bought together with ‘11359’, yet this combination has been bought only 28 times, which accounts for about 0.01% of all baskets. On the other hand, product ‘5260’ has an 81% chance of being bought together with ‘5925’, and 1,559 customers have bought this bundle. This means we could recommend ‘5925’ to any customer who has bought ‘5260’.
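
The columns of a rule are tied together by simple arithmetic, which can be used to back out figures the table does not print. The sketch below takes the printed values for the rule {5260} => {5925} and the transaction total from the summary; the variable names are my own, and the derived numbers are approximate because the printed values are rounded.

```r
n_trans    <- 222845      # transactions, from summary(retail.trans)
count      <- 1559        # baskets containing both 5260 and 5925
confidence <- 0.8140992   # printed confidence of {5260} => {5925}
lift       <- 59.81469    # printed lift of the same rule

# support = count / n_trans
support <- count / n_trans
print(support)

# confidence = count / count(lhs), so count(lhs) = count / confidence
lhs_count <- count / confidence
print(round(lhs_count))

# lift = confidence / support(rhs), so support(rhs) = confidence / lift
rhs_support <- confidence / lift
print(rhs_support)
print(round(rhs_support * n_trans))
```

Read this way, roughly 1,915 baskets contain ‘5260’ and roughly 3,033 contain ‘5925’, which is why a rule with under 1% support can still carry 81% confidence.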

What are the patterns that contain most products

rules_length <- lapply(LIST(retail.rules@lhs), function(x) unlist(strsplit(x, " ")))
retail_long <- head(retail.rules[order(lengths(rules_length),retail.rules@quality$confidence,decreasing = TRUE)],20)
inspect(retail_long)
##      lhs                    rhs     support      confidence lift     count
## [1]  {1446,2358,2768}    => {2914}  0.0001032108 0.9583333  60.44715 23   
## [2]  {1446,2612,7423}    => {2914}  0.0001076982 0.8888889  56.06692 24   
## [3]  {1446,2768,7423}    => {2914}  0.0002064215 0.8518519  53.73080 46   
## [4]  {1446,2878,7423}    => {2914}  0.0001211604 0.8437500  53.21978 27   
## [5]  {4239,4627,5260}    => {5925}  0.0001121856 0.8333333  61.22788 25   
## [6]  {1446,2768,2878}    => {2914}  0.0001076982 0.8275862  52.20024 24   
## [7]  {13903,13952,17927} => {13900} 0.0001615473 0.7826087  71.06782 36   
## [8]  {2459,2843,4016}    => {2312}  0.0001211604 0.7714286  81.05092 27   
## [9]  {13900,13903,17927} => {13952} 0.0001615473 0.7659574  73.10055 36   
## [10] {2467,2843,4016}    => {2312}  0.0001525724 0.7391304  77.65748 34   
## [11] {13903,13952,7452}  => {13900} 0.0001525724 0.7391304  67.11961 34   
## [12] {11731,13903,13952} => {13900} 0.0002557832 0.7307692  66.36034 57   
## [13] {4227,4627,5260}    => {5925}  0.0001032108 0.7187500  52.80905 23   
## [14] {2459,2467,2843}    => {2312}  0.0001211604 0.6923077  72.73800 27   
## [15] {2459,2467,4016}    => {2312}  0.0002198838 0.6901408  72.51034 49   
## [16] {2287,2685,8332}    => {2674}  0.0001929592 0.6825397  12.55473 43   
## [17] {13900,13903,17688} => {13952} 0.0001346227 0.6521739  62.24141 30   
## [18] {17756,2674,8332}   => {2685}  0.0001211604 0.6428571  11.98808 27   
## [19] {11731,13900,17688} => {13952} 0.0001032108 0.6388889  60.97353 23   
## [20] {11847,18608,2322}  => {18073} 0.0001480850 0.6226415  20.75580 33

The first four rules all have product ‘2914’ on the right-hand side and ‘1446’ on the left-hand side, meaning these products are often bought together.

Network Graph

plot(retail.rules, method="graph", control=list(type="items"))
## Available control parameters (with default values):
## main  =  Graph for 100 rules
## nodeColors    =  c("#66CC6680", "#9999CC80")
## nodeCol   =  c("#EE0000FF", "#EE0303FF", "#EE0606FF", ..., "#EEECECFF", "#EEEEEEFF")
## edgeCol   =  c("#474747FF", "#494949FF", "#4B4B4BFF", ..., "#E2E2E2FF", "#E2E2E2FF")
## alpha     =  0.5
## cex   =  1
## itemLabels    =  TRUE
## labelCol  =  #000000B3
## measureLabels     =  FALSE
## precision     =  3
## layout    =  NULL
## layoutParams  =  list()
## arrowSize     =  0.5
## engine    =  igraph
## plot  =  TRUE
## plot_options  =  list()
## max   =  100
## verbose   =  FALSE

The network graph shows associations between the products in the plotted rules (the graph method draws at most 100 rules by default). Larger circles imply higher support, while redder circles imply higher lift.

  • The most popular combination was ‘2674’ and ‘2685’; another popular combination was ‘2689’ and ‘2685’
  • If someone buys ‘17899’, they are likely to have bought ‘4051’ as well
  • Relatively many people buy ‘5260’ along with ‘5925’ (1,559 times)

#-------------Function for product recommendation--------------
# Test any basket you like
new_basket = c('7781')
next_buy = function(new_basket){
  it_new_basket = as(list(new_basket), "itemMatrix")
  # find all rules whose lhs is a subset of the current basket
  rulesMatchLHS <- is.subset(retail.rules@lhs,it_new_basket)
  # and whose rhs is NOT already in the basket (so something is left to recommend)
  suitableRules <-  rulesMatchLHS & !(is.subset(retail.rules@rhs,it_new_basket))
  possible_recomed = retail.rules[as.logical(suitableRules)]
  if(length(possible_recomed)==0){
    print('No association rules pass the threshold, consider other possible combination ')
  }else{
    # order the matching rules by confidence, highest first
    conf_order <- order(possible_recomed@quality$confidence, decreasing = TRUE)
    # the rhs of the most confident rule is the best recommendation
    recommendations <- paste(LIST(possible_recomed@rhs)[[conf_order[1]]], collapse=" ")
    print("Potential recommendations are...")
    inspect(possible_recomed[conf_order,])
    print(paste("Best recommendation would be ",recommendations))
    return(recommendations)
  }
}
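
The matching logic inside `next_buy` can be illustrated without arules. The sketch below is a simplified stand-in, not the function above: `toy_rules` is a hand-made rule table whose confidences are rounded from rules printed in this article, and `next_buy_sketch` is a hypothetical helper that keeps rules whose left-hand side is contained in the basket and whose right-hand side is not, then returns the right-hand side with the highest confidence.

```r
# Hand-made rule table; confidences are rounded from rules printed in this article
toy_rules <- data.frame(
  rhs        = c("2914", "7423", "2914"),
  confidence = c(0.76, 0.15, 0.85),
  stringsAsFactors = FALSE
)
toy_rules$lhs <- list("1446", "1446", c("1446", "2768", "7423"))

# Hypothetical helper mirroring the filtering idea of next_buy()
next_buy_sketch <- function(basket, rules) {
  basket <- as.character(basket)
  lhs_in_basket <- sapply(rules$lhs, function(l) all(l %in% basket))
  rhs_is_new    <- !(rules$rhs %in% basket)
  candidates    <- rules[lhs_in_basket & rhs_is_new, ]
  if (nrow(candidates) == 0) return(NA_character_)
  candidates$rhs[which.max(candidates$confidence)]
}

print(next_buy_sketch("1446", toy_rules))
print(next_buy_sketch(c("1446", "2914"), toy_rules))
print(next_buy_sketch("9999", toy_rules))
```

The second call shows why the right-hand-side check matters: once ‘2914’ is already in the basket, the next-best rule is used instead of recommending the same product again.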


Reverse recommendation: What other products would lead to this one

target_one = 11731
rules<-apriori(data=retail.trans, parameter=list(supp=sup,conf = conf,minlen=2),
               appearance = list(default="lhs",rhs=target_one),
               control = list(verbose=F))
rules<-sort(rules, decreasing=TRUE,by="confidence")
inspect(rules)
next_buy(target_one)

Test some basket

next_buy("11731")
basket = c("2014","2674")
next_buy(basket)

Comparison with cosine distance

DT_filter =  data
product_list <- unique(DT_filter$prod_oid)
user_list <- unique(DT_filter$user_id)
product_list_len <- length(product_list)
user_list_len <- length(user_list)
prod_len <- c(1:product_list_len)

# empty product-by-user matrix, filled in below
prod_user <- Matrix(0, nrow =product_list_len ,ncol = user_list_len)

colnames(prod_user) <- user_list

temp <- sapply(prod_len, function(x){
  DT_filter_i_user <- DT_filter %>%
    filter(prod_oid == product_list[x]) %>%
    select(user_id)
  DT_filter_i_user_v <- as.vector(t(DT_filter_i_user))
  prod_user[x,DT_filter_i_user_v] <<- 1
  rm(DT_filter_i_user, DT_filter_i_user_v)
  if(x %% 500 == 0){
    print(paste0(x,'/',product_list_len))
  }
})
## [1] "500/5067"
## [1] "1000/5067"
## [1] "1500/5067"
## [1] "2000/5067"
## [1] "2500/5067"
## [1] "3000/5067"
## [1] "3500/5067"
## [1] "4000/5067"
## [1] "4500/5067"
## [1] "5000/5067"
print("prod_user matrix finish")
## [1] "prod_user matrix finish"
rm(DT_filter,temp)
DT_filter_matrix <- prod_user
row.names(DT_filter_matrix) <- product_list
rm(prod_user)



DT_rowSums <- rowSums(DT_filter_matrix)
DT_rowSums_under_5 <- names(DT_rowSums)[DT_rowSums<=5]
rm(DT_rowSums)

print("start calculating cosine_score_matrix")
## [1] "start calculating cosine_score_matrix"
xxt <- DT_filter_matrix %*% t(DT_filter_matrix)
diag_xxt <- sqrt(diag(1/diag(xxt)))
score_matrix <- diag_xxt %*% xxt %*% diag_xxt
rownames(score_matrix) <- rownames(DT_filter_matrix)
colnames(score_matrix) <- rownames(DT_filter_matrix)
rm(xxt,diag_xxt)
score_matrix <- as.matrix(score_matrix)

DT_similar_prod <- data.frame(similar_prod_oid = rep(NA_real_,product_list_len*20),
                              score =  rep(NA_real_,product_list_len*20),
                              prod_oid = rep(NA_real_,product_list_len*20))

t1 <- Sys.time()
temp <- sapply(prod_len, function(x){
  DT_similar_prod[(20*x-19):(20*x),] <<-
    data.frame(score = score_matrix[x,],
               similar_prod_oid = product_list)[-x,] %>%
    arrange(desc(score)) %>%
    mutate(score_order = c(1:(product_list_len-1))) %>%
    filter(score_order <=20) %>%
    select(similar_prod_oid, score) %>%
    mutate(prod_oid = product_list[x])
  if(x %% 500 == 0){
    print(paste0(x,'/',product_list_len))
  }
} )
## [1] "500/5067"
## [1] "1000/5067"
## [1] "1500/5067"
## [1] "2000/5067"
## [1] "2500/5067"
## [1] "3000/5067"
## [1] "3500/5067"
## [1] "4000/5067"
## [1] "4500/5067"
## [1] "5000/5067"
rm(temp)

gc()
##            used  (Mb) gc trigger  (Mb)  max used  (Mb)
## Ncells  2672580 142.8    4703850 251.3   4703850 251.3
## Vcells 33402184 254.9  106178485 810.1 110423644 842.5
print("End of calculation cosine_score_matrix")
## [1] "End of calculation cosine_score_matrix"
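
The matrix algebra above computes, for every pair of products, the cosine similarity between their rows of the product-by-user matrix: the number of users who bought both, divided by the square root of the product of the two purchase counts. A small base-R sketch on a made-up 3 x 4 matrix (the names `m`, `d` and `cosine` are my own) shows the same normalisation:

```r
# Hypothetical product-by-user matrix: 1 means the user bought the product
m <- matrix(c(1, 1, 1, 0,
              1, 1, 0, 0,
              0, 0, 1, 1),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("p1", "p2", "p3"), paste0("u", 1:4)))

# Co-purchase counts; the diagonal holds each product's number of buyers
xxt <- m %*% t(m)

# Scale rows and columns by 1 / sqrt(number of buyers)
d <- diag(1 / sqrt(diag(xxt)))
cosine <- d %*% xxt %*% d
dimnames(cosine) <- dimnames(xxt)

print(round(cosine, 3))
```

Products with no common buyer (here `p2` and `p3`) score 0, and every product scores 1 against itself, which is why the product's own row is dropped before taking the top 20.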

Cosine distance vs. Apriori

#Apriori result
#Recommendation after buying

target_one = c(1446)
next_buy(target_one)
## [1] "Potential recommendations are..."
##     lhs       rhs     support      confidence lift      count
## [1] {1446} => {2914}  0.0060131482 0.7600681  47.941514 1340 
## [2] {1446} => {7423}  0.0011846799 0.1497448   3.386773  264 
## [3] {1446} => {2768}  0.0010231327 0.1293250   7.054941  228 
## [4] {1446} => {12245} 0.0008750477 0.1106069  19.750160  195 
## [1] "Best recommendation would be  2914"
## [1] "2914"
#Cosine result
head(DT_similar_prod[DT_similar_prod$prod_oid==target_one,],10)
##     similar_prod_oid      score prod_oid
## 361             2914 0.53691659     1446
## 362            12245 0.13146228     1446
## 363            11293 0.09779082     1446
## 364             2768 0.08495964     1446
## 365             7423 0.06334226     1446
## 366             2878 0.06225813     1446
## 367             2881 0.05335821     1446
## 368            11866 0.04767070     1446
## 369             2603 0.04125782     1446
## 370             2948 0.04000302     1446
#Reverse recommendation: Product lead to this product

rules<-apriori(data=retail.trans, parameter=list(supp=sup,conf = conf,minlen=2), 
               appearance = list(default="lhs",rhs=target_one),
               control = list(verbose=F))
rules<-sort(rules, decreasing=TRUE,by="confidence")
inspect(rules)
##      lhs                 rhs    support      confidence lift     count
## [1]  {2612,2914,7423} => {1446} 0.0001076982 0.5714286  72.22915   24 
## [2]  {11293}          => {1446} 0.0001391101 0.5438596  68.74441   31 
## [3]  {2768,2914,7423} => {1446} 0.0002064215 0.4339623  54.85327   46 
## [4]  {2881,2914}      => {1446} 0.0003589939 0.4145078  52.39421   80 
## [5]  {12245,2768}     => {1446} 0.0001166730 0.4126984  52.16550   26 
## [6]  {2813,2914}      => {1446} 0.0001121856 0.4032258  50.96815   25 
## [7]  {2878,2914,7423} => {1446} 0.0001211604 0.3913043  49.46127   27 
## [8]  {2914}           => {1446} 0.0060131482 0.3792811  47.94151 1340 
## [9]  {2603,2914}      => {1446} 0.0001929592 0.3739130  47.26299   43 
## [10] {2612,2914}      => {1446} 0.0003006574 0.3701657  46.78933   67 
## [11] {2914,7423}      => {1446} 0.0009199219 0.3422371  43.25911  205 
## [12] {2878,2914}      => {1446} 0.0004711795 0.3343949  42.26786  105 
## [13] {2914,7452}      => {1446} 0.0001435976 0.3106796  39.27022   32 
## [14] {1922,2914}      => {1446} 0.0002243712 0.2941176  37.17677   50 
## [15] {2768,2878,2914} => {1446} 0.0001076982 0.2926829  36.99542   24 
## [16] {1919,2914}      => {1446} 0.0001435976 0.2857143  36.11458   32 
## [17] {2768,2914}      => {1446} 0.0007673495 0.2722930  34.41811  171 
## [18] {2358,2768,2914} => {1446} 0.0001032108 0.2705882  34.20263   23 
## [19] {2914,2948}      => {1446} 0.0002468083 0.2350427  29.70964   55 
## [20] {2914,2930}      => {1446} 0.0001615473 0.2307692  29.16947   36 
## [21] {2358,2914}      => {1446} 0.0003589939 0.2173913  27.47848   80 
## [22] {11866}          => {1446} 0.0001121856 0.1602564  20.25657   25 
## [23] {12245}          => {1446} 0.0008750477 0.1562500  19.75016  195 
## [24] {2716,2914}      => {1446} 0.0002871951 0.1406593  17.77948   64
head(DT_similar_prod[DT_similar_prod$similar_prod_oid==target_one,],10)
##      similar_prod_oid      score prod_oid
## 352              1446 0.08495964     2768
## 401              1446 0.53691659     2914
## 710              1446 0.06225813     2878
## 1002             1446 0.13146228    12245
## 3200             1446 0.03536536     2716
## 3471             1446 0.05335821     2881
## 4534             1446 0.04000302     2948
## 4600             1446 0.01136833     4406
## 4615             1446 0.04125782     2603
## 5095             1446 0.06334226     7423
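
One way to relate the two sets of numbers: for a pair of single products, cosine similarity is count(A and B) / sqrt(count(A) * count(B)), which equals the geometric mean of the two rule confidences, sqrt(conf(A => B) * conf(B => A)), provided both are computed on the same baskets. The sketch below checks this against the values printed above for products ‘1446’, ‘2914’ and ‘12245’ (the variable names are my own; small differences are expected from rounding).

```r
# Confidences printed by Apriori above
conf_1446_to_2914  <- 0.7600681
conf_2914_to_1446  <- 0.3792811
conf_1446_to_12245 <- 0.1106069
conf_12245_to_1446 <- 0.1562500

# Cosine scores printed from DT_similar_prod above
cos_1446_2914  <- 0.53691659
cos_1446_12245 <- 0.13146228

# Geometric mean of the two directional confidences
gm_2914  <- sqrt(conf_1446_to_2914  * conf_2914_to_1446)
gm_12245 <- sqrt(conf_1446_to_12245 * conf_12245_to_1446)

print(c(geometric_mean = gm_2914,  cosine = cos_1446_2914))
print(c(geometric_mean = gm_12245, cosine = cos_1446_12245))
```

Confidence is directional while cosine similarity is symmetric, so the cosine score cannot distinguish ‘1446’ leading to ‘2914’ (confidence 0.76) from the reverse (0.38).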


d = data.frame( data, stringsAsFactors = FALSE )
data <- datatable(d, filter = 'bottom', options = list(pageLength = 5))
data